Method and apparatus for maintaining consistent per-hop forwarding behavior in a network using network-wide per-hop behavior definitions

ABSTRACT

A method, apparatus, and computer-readable medium configured for maintaining consistent per-hop packet forwarding behavior among a plurality of network devices in a network within a Differentiated Services (DS) domain are disclosed. In one aspect, a method involves creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain. A mapping of each of the PHBs in the network-wide PHB definition to one or more queues of the network devices is determined. Drain size and queue size values are determined for each of the queues to which PHBs are mapped. A mapping of each of the PHBs to a threshold value associated with the queues is determined. Parameters of fragmentation and interleave mechanisms are determined. Network device configuration parameter values based on the mappings, the drain size, and the queue size, etc. are sent to each of the network devices within the DS domain. As a result, consistent PHB is achieved throughout a network using abstract definitions of PHBs.

RELATED APPLICATION

This application is a continuation of and claims priority to U.S. patent application Ser. No. 09/753,034, filed on Dec. 28, 2000, now U.S. Pat. No. 6,839,327, entitled “METHOD AND APPARATUS FOR MAINTAINING CONSISTENT PER-HOP FORWARDING BEHAVIOR IN A NETWORK USING NETWORK-WIDE PER-HOP BEHAVIOR DEFINITIONS,” naming as inventors Arthur Zavalkovsky, Nitsan Elfassy, and Ron Cohen, which claims domestic priority from prior Provisional application Ser. No. 60/250,749, filed Dec. 01, 2000, entitled “METHOD AND APPARATUS FOR MAINTAINING CONSISTENT PER-HOP ROUTING BEHAVIOR IN A NETWORK USING NETWORK-WIDE SERVICE DEFINITIONS,” naming as inventors Arthur Zavalkovsky, Nitsan Elfassy, and Ron Cohen, the contents of both of which are hereby incorporated by reference in their entirety for all purposes.

FIELD OF THE INVENTION

The present invention relates generally to computer networking. The invention relates more specifically to a method and apparatus for maintaining consistent per-hop packet forwarding behavior in a network using network-wide per-hop behavior definitions.

BACKGROUND OF THE INVENTION

A computer network typically comprises a plurality of interconnected entities that transmit (“source”) or receive (“sink”) data frames. A common type of computer network is a local area network (“LAN”) that generally comprises a privately owned network within a single building or campus. LANs employ a data communication protocol (LAN standard) such as Ethernet, FDDI, or Token Ring, that defines the functions performed by the data link and physical layers of a communications architecture (i.e., a protocol stack), such as the Open Systems Interconnection (OSI) Reference Model. In many instances, multiple LANs may be interconnected by point-to-point links, microwave transceivers, satellite hookups, etc., to form a wide area network (“WAN”), metropolitan area network (“MAN”) or Intranet. These internetworks may be coupled through one or more gateways to the global, packet-switched internetwork generally known as the Internet or World Wide Web (WWW).

Each network entity preferably includes network communication software, which may operate in accordance with Transmission Control Protocol/Internet Protocol (TCP/IP). TCP/IP generally consists of a set of rules defining how entities interact with each other. In particular, TCP/IP defines a series of communication layers, including a transport layer and a network layer. At the transport layer, TCP/IP includes both the User Datagram Protocol (UDP), which is a connectionless transport protocol, and TCP, which is a reliable, connection-oriented transport protocol. When a process at one network entity wishes to communicate with another entity, it formulates one or more messages and passes them to the upper layer of the TCP/IP communication stack. These messages are passed down through each layer of the stack where they are encapsulated into packets and frames. Each layer also adds information in the form of a header to the messages. The frames are then transmitted over the network links as bits. At the destination entity, the bits are re-assembled and passed up the layers of the destination entity's communication stack. At each layer, the corresponding message headers are stripped off, thereby recovering the original message that is handed to the receiving process.

One or more intermediate network devices are often used to couple LANs together and allow the corresponding entities to exchange information. For example, a bridge may be used to provide a “bridging” function between two or more LANs. Alternatively, a switch may be utilized to provide a “switching” function for transferring information, such as data frames or packets, among entities of a computer network. Typically, the switch is a computer having a plurality of ports that couple the switch to several LANs and to other switches. The switching function includes receiving data frames at a source port and transferring them to at least one destination port for receipt by another entity. Switches may operate at various levels of the communication stack. For example, a switch may operate at Layer 2, which, in the OSI Reference Model, is called the data link layer, and includes the Logical Link Control (LLC) and Media Access Control (MAC) sub-layers.

Other intermediate devices, commonly known as routers, may operate at higher communication layers, such as Layer 3, which in TCP/IP networks corresponds to the Internet Protocol (IP) layer. Conventionally, IP data packets include a corresponding header that contains an IP source address and an IP destination address. Routers or Layer 3 switches may re-assemble or convert received data frames from one LAN standard (e.g., Ethernet) to another (e.g., Token Ring). Thus, Layer 3 devices are often used to interconnect dissimilar subnetworks. Some Layer 3 intermediate network devices may also examine the transport layer headers of received messages to identify the corresponding TCP or UDP port numbers being utilized by the corresponding network entities. Many applications are assigned specific, fixed TCP and/or UDP port numbers in accordance with Request For Comments (RFC) 1700. For example, TCP/UDP port number 80 corresponds to the Hypertext Transfer Protocol (HTTP), while port number 21 corresponds to File Transfer Protocol (FTP) service.

A process executing at a network entity may generate hundreds or thousands of traffic flows that are transmitted across a network. Generally, a traffic flow is a set of messages (frames and/or packets) that typically correspond to a particular task, transaction or operation (e.g., a print transaction) and may be identified by various network and transport parameters, such as source and destination IP addresses, source and destination TCP/UDP port numbers, and transport protocol.

The treatments that are applied to different traffic flows may vary depending on the particular traffic flow at issue. For example, an online trading application may generate stock quote messages, stock transaction messages, transaction status messages, corporate financial information messages, print messages, data backup messages, etc. A network administrator may wish to apply a different policy or service treatment (“quality of service” or “QoS”) to each traffic flow. In particular, the network administrator may want a stock quote message to be given higher priority than a print transaction. Similarly, a $1 million stock transaction message for a premium client should be assigned higher priority than a $100 stock transaction message for a standard customer.

Computer networks include numerous services and resources for use in moving traffic throughout the network. For example, different network links, such as Fast Ethernet, Asynchronous Transfer Mode (ATM) channels, network tunnels, satellite links, etc., offer unique speed and bandwidth capabilities. Additionally, the intermediate devices also include specific resources or services, such as number of priority queues, filter settings, availability of different queue selection strategies, congestion control algorithms, etc.

Individual frames or packets can be marked so that intermediate devices may treat them in a predetermined manner. For example, the Institute of Electrical and Electronics Engineers (IEEE) describes additional information for the MAC header of Data Link Layer frames in Appendix 802.1p to the 802.1D bridge standard.

A Differentiated Services (DS) model is under development by the Internet Differentiated Services Working Group of the Internet Engineering Task Force (IETF). IETF is seeking to promote DS as the “right” approach for implementing scalable service differentiation in large networks. Network devices that implement DS are said to form a DS domain. Such devices may include routers and switches.

The DS model comprises a number of functional elements that are implemented in network nodes. These functional elements include a small set of per-hop forwarding behaviors (PHBs) that routers or switches may apply to packets, packet classification functions, and traffic conditioning functions, including metering, marking, shaping and policing functions. The DS architecture achieves scalability by implementing complex classification and conditioning functions only at network nodes located at network boundaries, and by applying specified per-hop behavior to aggregates of traffic which have been appropriately marked using the DS field in the IP version 4 (“IPv4”) or IPv6 headers. The PHBs are defined to permit a reasonably granular means of allocating buffer and bandwidth resources at each node among competing traffic streams.

Information about DS may be found in: D. Grossman, “New Terminology for Diffserv,” draft-ietf-diffserv-new-terms-02.txt, November, 1999; V. Jacobson et al., “An Expedited Forwarding PHB,” RFC 2598, June, 1999; F. Baker et al., “Management Information Base for the Differentiated Services Architecture,” draft-ietf-diffserv-mib-02.txt, March, 2000; J. Heinanen et al., “Assured Forwarding PHB Group,” RFC 2597, June, 1999; S. Blake et al., “An Architecture for Differentiated Services,” RFC 2475, December, 1998; S. Brim et al., “Per Hop Behavior Identification Codes,” RFC 2836, May, 2000; “Differential Service for the Internet,” available at the “diffserv.lcs.mit.edu” Web site; J. Wroclawski, “Applications, Flexibility, and Differential Services,” May, 1998, available in the Web site document “internet2.edu/qos/may98Workshop/presentations/Wroclawski/sld001.htm”; A. Campbell, “Differential Services: Services and Mechanisms, Lecture 11,” Jan. 13, 2000, available in the Web site document “comet.ctr.columbia.edu/distributed/lectures/lecture11/sld001.htm.”

In general, the network administrator is responsible for defining the QoS provided within a network. A network administrator cannot be expected to understand all of the details that are required to modify the schema for an appropriate per-hop-behavior configuration that is effective across the network. In the current DS model, there is no simple way for an administrator to specify the relationship and interdependencies between the different PHBs that are in effect in the DS domain. In particular, there is no simple way to allocate network resources to each PHB in a device-independent way.

Based on the foregoing, there is a clear need for a mechanism that can enable a network administrator to define a network-wide specification of QoS PHBs deployed within a DS domain.

There is also a need for a mechanism that will allow a network administrator to define the relationship and interdependencies between the different per-hop-behaviors within a DS domain.

There is also a need for a mechanism for providing consistent per-hop-behavior device configuration derived from the network-wide PHB specification, thus ensuring end-to-end enforcement of the QoS PHBs.

The current DS approach does allow definition of relationships between the different PHBs within an AF PHB group, but it does not provide an easy way to define the relationship between the resources allocated to the entire set of PHBs enforced in the domain. In particular, it does not provide a way to specify the relationship between the AF PHB groups and the relationship to the EF and BE PHBs or any user-defined PHB. There is also a need for a way to guarantee minimal bandwidth and buffer resources to each PHB as well as to limit a PHB from exploiting more than its share of these resources.

There is also a need for a way to automatically generate device configuration parameters for the PHBs so that device behavior matches the definition of the PHB.

SUMMARY OF THE INVENTION

The foregoing needs, and other needs that will become apparent from the following description, are achieved in the present invention, which comprises, in certain embodiments, a method, apparatus, and computer-readable medium configured for maintaining consistent per-hop packet forwarding behavior among a plurality of network devices in a network within a Differentiated Services (DS) domain.

In one aspect, a method involves creating and storing a network-wide service definition that associates a service with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain. A mapping of each of the PHBs in the network-wide PHB definition to one or more queues of the network devices is determined. Drain size and queue size values are determined for each of the queues to which PHBs are mapped. A mapping of each of the PHBs to a threshold value associated with the queues is determined. Parameters for fragmentation and interleave mechanisms are determined. Network device configuration parameter values based on the mappings, the drain size, and the queue size, etc. are sent to each of the network devices within the DS domain. As a result, consistent PHB is achieved throughout a network using abstract definitions of PHBs.

In one embodiment, a network-wide PHB table is defined, containing values of parameters that define properties of PHBs. These properties are mapped to detailed per-hop-behavior parameters.

Alternatively, the invention provides a schema that represents a set of quality of service PHBs, and a translation mechanism for converting representations of PHBs into parameters that specify the required per-hop-behaviors for supporting the PHBs in network devices such as routers and switches. The translation mechanism ensures that each device applies the proper treatment to packets that are associated with a specified PHB. The translation mechanism may involve creating and storing queue scheduling mechanisms, configuring the scheduling mechanisms on each device, etc.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1A is a simplified block diagram that conceptually illustrates aspects of an embodiment.

FIG. 1B is a simplified block diagram providing an alternative conceptual illustration of aspects of an embodiment.

FIG. 1C is a block diagram of a computer network in which certain embodiments of the invention may be used.

FIG. 2 is an example of a PHB set schema that may be used to define a PHB set for a DS domain.

FIG. 3 is a block diagram of a process of determining configuration parameters for services.

FIG. 4 is a block diagram of a computer system with which an embodiment may be carried out.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

A method and apparatus for maintaining consistent per-hop packet forwarding behavior in a network using network-wide PHB definitions is described. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

Conceptual Overview

Within a DS domain, the network administrator needs to have the ability to choose the set of PHBs that the administrator intends to use and to modify or “tune” the different parameters for each PHB to achieve desired performance in the network. Because all PHBs defined within a DS domain are required to share the domain's bandwidth and buffer resources for each hop, tuning should be made on the PHB set selected for the domain (that is, all PHBs defined for the domain), and not individually for each PHB.

In one embodiment, an abstract representation of a set of PHBs that includes a set of tuning parameters is used to define a network-wide service definition for maintaining consistent per-hop-behaviors within a DS domain. In certain embodiments, the tuning parameters are defined such that they do not depend on the details of the individual QoS mechanisms that are implemented on each network device.

FIG. 1A is a simplified block diagram that conceptually illustrates aspects of an embodiment. A network-wide PHB table 2 is defined, containing values of parameters that define properties of PHBs. As indicated by block 4, the properties are mapped to detailed per-hop-behavior parameters 6. PHB table 2 provides a way to describe relationships among PHBs but not the means for configuring each device, which is carried out during the mapping.

FIG. 1B is a simplified block diagram providing an alternative conceptual illustration of aspects of an embodiment. In this embodiment the invention provides a schema 8 that represents a set of quality of service PHBs 10, and a translation mechanism 14 for converting representations of PHBs into parameters 12 that specify the required per-hop-behaviors for supporting the PHBs in network devices such as routers and switches.

Operational Context

FIG. 1C is a block diagram of a computer network 100 in which certain embodiments of the invention may be used. Generally, computer network 100 includes one or more network devices 120, 122, 124, 126, a plurality of workstations 116, 118, a policy management station 102, and a network 128.

Network devices 120, 122 represent edge network devices such as routers, switches, or other similar or equivalent devices that are configured for coloring packets within network 128. In one embodiment, network devices 120, 122 are configured to execute the Cisco Internetworking Operating System (IOS) and are capable of marking packets with DSCP values, i.e., they are compatible with Differentiated Services. Such marking may be carried out using a marker or other software element or application that runs under control of IOS, e.g., an agent or process.

Network devices 124, 126 represent internal network devices (“core devices”) such as routers, switches, or other similar or equivalent devices that are configured for forwarding packets within network 128 based on the color of each packet. In certain embodiments, network devices 124, 126 are configured to execute the IOS and are compatible with Differentiated Services. Network devices 120, 122 and network devices 124, 126 may in fact represent similar or even identical device types and/or models that are each configured to perform a designated function within computer network 100.

Workstations 116, 118 may be personal computers, workstations, or other network end stations at which work is done, such as printers, scanners, facsimile machines, etc. In certain embodiments, workstations 116, 118 are network devices, such as bridges, gateways, routers or switches that allow computer network 100 to connect to another network system. For example, workstation 116 may be an edge device that is configured for coloring packets of a different DS domain. In certain embodiments, workstations 116, 118 execute one or more applications 112, 114. Applications 112, 114 may represent a variety of different computer applications that execute on workstations 116, 118 respectively and which cause data to be sent and received over network 128.

Network 128 comprises any number of network devices. Network 128 may form part of a LAN or WAN. In one embodiment, network 128 is a packet-switched IP network configured as a DS domain whereby treatment of packets that flow through network 128 is controlled and managed by Policy Management Station 102 and network devices 120, 122, 124, 126. In this example, the DS domain includes network devices 120, 122, 124, 126.

Policy Management Station 102 is a computer, or a group of hardware or software components or processes that cooperate or execute in one or more computer systems. In this example, Policy Management Station 102 includes a policy coordinator 104 and one or more policy servers 106, 108, 110, that are coupled to network devices 120, 122, 124, 126. In one embodiment, policy coordinator 104 communicates with policy servers 106, 108, 110 to configure the network devices 120, 122, 124, 126, to control the coloring and forwarding of packets within network 128. For example, to implement a particular per-hop behavior, policy coordinator 104 may direct network devices 120, 122 to color the packets of all Voice Over IP (VOIP) flows with the color Gold (high priority) and to color the packets of all File Transfer Protocol (FTP) flows with the color Bronze (low priority). Each color corresponds to a particular PHB and is associated with one or more QoS treatment parameters, e.g., a pre-defined DSCP value and possibly other values or characteristics. Policy coordinator 104 may further direct network devices 124, 126 to apply a particular forwarding policy based on the particular color of each packet that is processed.

In one embodiment, Policy Management Station 102 provides a mechanism whereby a network administrator may easily create and modify (“define” and “tune”) a domain PHB set to coordinate the application of multiple PHBs within a DS domain. An example of a mechanism that may be used by a network administrator to define and tune a domain PHB set is described in detail below. Once the domain PHB set is defined, policy coordinator 104 communicates with the policy servers 106, 108, 110 to cause edge devices 120, 122 and core devices 124, 126 to implement the domain PHB set definition.

An example of a commercial product suitable for use as Policy Management Station 102 is CiscoAssure QoS Policy Manager 1.0, commercially available from Cisco Systems, Inc.

Although the example embodiment of FIG. 1C shows two (2) workstations 116, 118, three (3) policy servers 106, 108, 110, two (2) edge devices 120, 122, and two (2) core devices 124, 126, in other practical embodiments there may be any number of such elements. In addition, Policy Management Station 102 is provided as only an example of one mechanism that may be used to define a domain PHB set. Policy Management Station 102 may be configured as a single component or a plurality of different distributed components that are configured to define a domain PHB set for implementing QoS policies within a network. In addition, in certain embodiments, policy servers 106, 110 are coupled to network 128 and may communicate with edge devices 120 and 122 over network 128.

Defining a Domain PHB Set

FIG. 2 illustrates an example of domain PHB set schema 200 that may be used to define a domain PHB set. In one embodiment, policy coordinator 104 is configured to provide a user interface to allow a user, such as a network administrator, to define a domain PHB set for a particular DS domain, such as domain PHB set schema 200.

In this example, schema 200 is configured as a table having data columns 202, 204, 206, 208 that include specific parameter values that define a particular PHB. Thus, each PHB comprises a plurality of parameter values, and a domain PHB set comprises a set of PHBs that apply to a particular DS domain. For example, column 202 represents a Best Effort (BE) PHB definition, columns 204, 206 respectively represent Assured Forwarding (AF11, AF12) PHB definitions and column 208 represents an Expedited Forwarding (EF) PHB definition.

One or more Domain PHB tuning parameter values are used to tune the set of PHBs that are contained within a DS domain. In the example of FIG. 2, the Domain PHB tuning parameters for schema 200 include a Name field 210, a DSCP field 212, a Forwarding Class field 214, a Default Class field 216, an Immediate Forwarding field 218, a Reserved Bandwidth field 220, a Forwarding Factor field 222, a Reserved Packets field 224, a Buffer Factor field 226, a Packet Size field 228, a Traffic Type field 230 and a Per Hop Delay field 232.

The value of Name field 210 identifies the name for the PHB. Examples include EF, Mission Critical, AF12, etc.

The value of DSCP field 212 identifies the distinct DSCP value that is associated with the PHB.

The Forwarding Class field 214 identifies all PHBs that belong to the same forwarding class. A network node must preserve the order of packets for all PHBs having the same forwarding class. In this example, AF11 and AF12 are defined as belonging to the same forwarding class, namely class “1”.

The Default Class field 216 is a Boolean flag that is set on one and only one PHB entry. When set, the flag value specifies that a particular PHB is the default PHB that is provided to flows having an unspecified PHB selector. In one embodiment, the flag is set by default for the Best Effort PHB.
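The identification fields described so far can be pictured as a small record type. The following Python sketch is illustrative only and is not taken from FIG. 2: the class name `Phb`, the helper functions, the forwarding class labels for BE and EF, and the DSCP numbers (the commonly used code points for BE, AF11, AF12 and EF) are assumptions. Only the grouping of AF11 and AF12 in class “1”, the default flag on BE, and the immediate forwarding flag on EF come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phb:
    """One column of a domain PHB set (hypothetical representation)."""
    name: str
    dscp: int
    forwarding_class: str
    default_class: bool = False
    immediate_forwarding: bool = False


def find_default_phb(phbs):
    """Return the default PHB; the flag must be set on exactly one entry."""
    defaults = [p for p in phbs if p.default_class]
    if len(defaults) != 1:
        raise ValueError(f"expected exactly one default PHB, found {len(defaults)}")
    return defaults[0]


def group_by_forwarding_class(phbs):
    """Map each forwarding class to the names of the PHBs that share it."""
    groups = {}
    for p in phbs:
        groups.setdefault(p.forwarding_class, []).append(p.name)
    return groups


# Assumed example values; class labels "0" and "2" are placeholders.
EXAMPLE_SET = [
    Phb("BE", 0, "0", default_class=True),
    Phb("AF11", 10, "1"),
    Phb("AF12", 12, "1"),
    Phb("EF", 46, "2", immediate_forwarding=True),
]
```

A check such as `find_default_phb` could run whenever the administrator edits the set, so that a set with zero or two default entries is rejected before any device is configured.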

The Immediate Forwarding field 218 is a Boolean flag that specifies whether immediate forwarding of packets belonging to the particular PHB is required. In this example, the flag is set in the EF PHB.

The value of Reserved Bandwidth field 220 specifies the bandwidth in Kb/sec reserved for the behavior aggregate. For PHBs that do not require immediate forwarding, the reserved bandwidth value determines the minimum bandwidth that is reserved for this service. In immediate forwarding PHBs, the reserved bandwidth indicates the maximum bandwidth allowed for the behavior aggregate. Such a limit on bandwidth is provided in order to ensure that other behavior aggregates have sufficient bandwidth. In general, all PHBs with the same forwarding class share the same reserved bandwidth value. For example, because AF11 and AF12 share the same forwarding class, they also share the same reserved bandwidth value of 512 Kb/sec, as indicated by the value in FIG. 2 at the intersection of Reserved Bandwidth parameter 220 with column 204.

The value of Forwarding Factor field 222 defines the percentage of the scheduling resources that may be consumed by the behavior aggregate. This is an alternative representation of the reserved bandwidth field. It allows the network administrator to manage scheduling resources without knowledge of the particular link speeds. Within a domain PHB set, an administrator may define a mixed assignment of forwarding factors to some forwarding classes and reserved bandwidth to other classes.
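As a rough illustration of how the two representations relate on a single interface, the sketch below converts between a forwarding factor and a bandwidth figure. It assumes that the percentage is taken of the interface link speed; that reading, the function names, and the 2048 Kb/sec link are assumptions for illustration, while the 512 Kb/sec figure is the AF reservation mentioned above.

```python
def reserved_kbps_from_factor(forwarding_factor_pct, link_speed_kbps):
    """Bandwidth (Kb/sec) implied by a forwarding factor on one link.

    Assumes the factor is a percentage of the link speed.
    """
    return link_speed_kbps * forwarding_factor_pct / 100.0


def factor_from_reserved_kbps(reserved_kbps, link_speed_kbps):
    """Forwarding factor (percent) implied by a reserved bandwidth on one link."""
    return 100.0 * reserved_kbps / link_speed_kbps


# 512 Kb/sec on a hypothetical 2048 Kb/sec link corresponds to 25 percent.
af_factor = factor_from_reserved_kbps(512, 2048)
```

With a conversion of this kind, a mixed PHB set (factors for some classes, Kb/sec for others) can be normalized per interface before queue parameters are computed.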

The value of Reserved Packets field 224 defines the number of packets that will be reserved in a queue prior to discard. Large values allow for sustained bursts. Within a forwarding class, this parameter specifies the relative drop precedence of PHBs. For example, within the AF PHB group, the total number of packets reserved for AF1x is assigned to AF11, while smaller numbers of packets are reserved for the AF12 and AF13 PHBs respectively.

The value of Buffer Factor field 226 defines a percentage of the buffer resources that are allocated for the behavior aggregate. This is an alternative representation of the Reserved Packets field 224. It allows the network administrator to manage buffer resources without knowledge of the particular queue lengths. In order to translate the buffer factor field into an Actual Reserved Packet field, a global parameter specifying the total buffer space should be used. The sum of buffer factors assigned to the forwarding classes is 100%. Within each forwarding class, the buffer factor specifies the relative drop precedence of the PHBs.
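The translation from a buffer factor to a packet count can be sketched as follows. The function names and the percentages (30, 60 and 10) are hypothetical; the total of 20 packets is the value assumed for the mapping example later in this description. Integer rounding down is an assumption, since the description does not state a rounding rule.

```python
def reserved_packets(buffer_factor_pct, total_buffer_packets):
    """Packets reserved for a behavior aggregate, rounded down."""
    return (buffer_factor_pct * total_buffer_packets) // 100


def check_class_factors(class_factors):
    """Verify that buffer factors assigned to forwarding classes sum to 100%."""
    total = sum(class_factors.values())
    if total != 100:
        raise ValueError(f"forwarding class buffer factors sum to {total}, not 100")


# Hypothetical per-class buffer factors for a BE / AF / EF set.
CLASS_FACTORS = {"BE": 30, "AF": 60, "EF": 10}
check_class_factors(CLASS_FACTORS)
PACKETS = {name: reserved_packets(pct, 20) for name, pct in CLASS_FACTORS.items()}
```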

Packet Size field 228 is a parameter that describes the average or typical packet size in bytes of traffic of a behavior aggregate. This parameter is used when there is a need to arrive at a byte count representation of the Reserved Packets field. For some schedulers, a drain-size parameter is required per queue. Drain size determines the number of bytes a scheduler fetches from each queue in each cycle. The ratio between drain sizes is determined by either the ratio between the values of the Reserved Bandwidth field 220 or the ratio between the forwarding factors. Drain-size values should approximate a multiple of a typical packet size and not be smaller than one packet size of the PHB the queue serves.
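One way to satisfy both drain-size constraints is sketched below. This is only one possible procedure, not a method stated in the description: it scales the reserved bandwidths by the smallest common factor that gives every queue at least one typical packet, then rounds each result up to a whole number of packets. The function name and the 512-byte packet sizes are assumptions; the 64 and 512 Kb/sec rates are the BE and AF rates used in the mapping example.

```python
from fractions import Fraction
import math


def drain_sizes(queues):
    """Compute per-queue drain sizes in bytes.

    queues maps a queue name to (reserved_kbps, typical_packet_bytes).
    The ratio between drain sizes follows the ratio between reserved
    bandwidths, before rounding up to whole packets.
    """
    # Smallest scale at which every queue drains at least one packet.
    scale = max(Fraction(pkt, bw) for bw, pkt in queues.values())
    sizes = {}
    for name, (bw, pkt) in queues.items():
        raw = scale * bw
        # Round up to a multiple of the typical packet size.
        sizes[name] = math.ceil(raw / pkt) * pkt
    return sizes


EXAMPLE_DRAIN = drain_sizes({"BE": (64, 512), "AF": (512, 512)})
```

In this example the BE queue drains one 512-byte packet per cycle and the AF queue drains eight, which preserves the 1:8 bandwidth ratio.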

The Traffic Type field 230 includes a flag that specifies whether the traffic of the behavior aggregate is elastic or not. Most TCP traffic is elastic and can adapt to the available network resources, in the sense that the TCP protocol can counteract drops by ensuring that dropped packets are re-sent. The RED discard mechanism is useful for elastic PHBs. An example of non-elastic traffic is UDP traffic carrying voice. For this type of traffic, RED does not provide any benefit.

Maximum Per Hop Delay field 232 is a parameter that describes the maximum delay, e.g., in milliseconds, before a packet having a specified behavior aggregate is forwarded. This parameter is primarily relevant to PHBs that provide immediate packet forwarding. On slow speed interfaces, this parameter makes it possible to calculate whether fragmentation and interleave of packets is required. Examples of such mechanisms include MPPP or Frame-Relay FRF-12 LFI. The fragment size (fs) is the amount of data that the link can transmit within the maximum per-hop delay, and is determined according to the relation:

fs = max-phop-delay × link-speed

If fs is larger than the link MTU, then fragmentation and interleave is not required.
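The fragmentation decision can be illustrated with a short calculation. The sketch assumes that the fragment size is the amount of data a link can send within the delay bound (delay multiplied by link speed, which is the dimensionally consistent reading), that 1 Kb equals 1000 bits, and that fragmentation is needed whenever the fragment size is smaller than the MTU. The function names and the example figures are hypothetical.

```python
def fragment_size_bytes(max_phop_delay_ms, link_speed_kbps):
    """Bytes a link can transmit within the per-hop delay bound.

    Kb/sec multiplied by milliseconds yields bits; divide by 8 for bytes.
    """
    return max_phop_delay_ms * link_speed_kbps / 8.0


def needs_fragmentation(max_phop_delay_ms, link_speed_kbps, mtu_bytes):
    """True when packets must be fragmented to meet the delay bound."""
    return fragment_size_bytes(max_phop_delay_ms, link_speed_kbps) < mtu_bytes


# A 10 ms bound on a 64 Kb/sec link allows only 80-byte fragments,
# so a 1500-byte MTU calls for fragmentation and interleave.
slow_link = needs_fragmentation(10, 64, 1500)
# On a 2048 Kb/sec link the same bound allows 2560 bytes, above the MTU.
fast_link = needs_fragmentation(10, 2048, 1500)
```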

Mapping a PHB Set to Device Configuration

The foregoing PHB set schema may be used to automatically create and store a set of specific device configuration parameter values. In one embodiment, a PHB set table may be transformed into another information representation that is useful in device configuration, such as a MIB, PIB, or actual configuration commands for a device.

The IETF DS working group has proposed several alternatives to model PHB configuration on devices, as set forth in F. Baker et al., “Management Information Base for the Differentiated Services Architecture,” draft-ietf-diffserv-mib-02.txt, March, 2000, and M. Fine et al., “Quality of Service Policy Information Base,” draft-mfine-cops-pib-01.txt, Sep. 25, 1999. The configuration of device queues and actions required by the models proposed in these papers can be determined using the PHB schema described herein.

It is also possible to create queue configurations in the models proposed in the foregoing papers that cannot be derived from a PHB set that is defined according to the schema described herein. However, the PHB set schema described herein is believed to adequately model viable PHBs in a simple and straightforward manner. For example, some queue models allow for multiple scheduling priority levels. The need for more than two levels of priority is not known to lead to meaningful PHBs; therefore, the present schema provides only one priority level, denoted by the immediate forwarding flag.

The following description provides an example of how queues and actions defined in the paper of Baker et al. can be configured. Detailed explanations of each of the resulting fields are found in Baker et al. Queues are modeled as a set of independent FIFO queues, each having the parameters shown in Table 1.

TABLE 1 QUEUE PARAMETERS

DiffServQueueEntry ::= SEQUENCE {
    diffServQueueNumber       INTEGER,
    diffServQueueMinimumRate  Unsigned32,
    diffServQueueMaximumRate  Unsigned32,
    diffServQueuePriority     Unsigned32,
    diffServQueueNextTCB      RowPointer,
    diffServQueueStatus       RowStatus
}

The example PHB set defined above can be mapped to the foregoing values in a stored table having the values shown in Table 2.

TABLE 2
EXAMPLE PHB SET QUEUE MAPPING

NAME                         BE      AF      EF
diffServQueueNumber          1       2       3
diffServQueuemMinimumRate    64      512     —
diffServQueueMaximumRate     —       —       512
diffServQueuePriority        0       0       1
diffServQueueNextTCB         NULL    NULL    NULL

In order to provide full PHB configuration within the model introduced in Baker et al., the action entry configuration is also specified. Actions are modeled using the values set forth in Table 3.

TABLE 3
ACTION ENTRY PARAMETERS

DiffServActionEntry ::= SEQUENCE {
    diffServActionNumber          INTEGER,
    diffServActionNext            RowPointer,
    diffServActionDSCP            Dscp,
    diffServActionMinThreshold    Unsigned32,
    diffServActionMaxThreshold    Unsigned32,
    diffServActionDropPolicy      INTEGER,
    diffServActionStatus          RowStatus
}

The example PHB set defined above can be mapped to the model of Table 3 in a stored table having the values shown in Table 4.

TABLE 4
EXAMPLE PHB SET ACTION MAPPING

diffServActionNumber          1            2             3            4
diffServActionNext            queue1       queue2        queue3       queue4
diffServActionDSCP            BE           AF11          AF12         EF
diffServActionMinThreshold    2 packets    4 packets     2 packets    2 packets
diffServActionMaxThreshold    6 packets    12 packets    6 packets    2 packets
diffServActionDropPolicy      random       random        random       tail

The foregoing mappings assume that the total-packet-in-queue parameter is set to 20 packets.

The buffer factor percentages determine maximum queue threshold values. The minimum threshold values are not determined by the PHB set specification. When an active queue management scheme such as Random Early Detection (RED) is used, RED minimal thresholds may be determined by adjusting to network behavior. For RED thresholds, the minimum threshold value is set to be ⅓ of the maximum threshold value. Fragmentation and interleave are not modeled in the paper of Baker et al. and therefore are not included in the foregoing mapping. Determining threshold values provides another way of giving values to the relationship between PHBs within a class.
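
The threshold rule described above can be illustrated with a minimal sketch. It assumes that the maximum threshold is the buffer factor percentage applied to the total-packet-in-queue value, and that the RED minimum threshold is one third of the maximum. The function name and the buffer factor values of 30% and 60% are hypothetical; they were chosen so that the results coincide with the BE and AF11 columns of Table 4 for a 20-packet queue.

```python
def red_thresholds(total_packets_in_queue: int, buffer_factor_percent: int) -> tuple[int, int]:
    """Return (min_threshold, max_threshold) in packets for one PHB.

    Assumption: the buffer factor is a percentage of the total queue length,
    and the RED minimum threshold is one third of the maximum threshold.
    """
    max_threshold = total_packets_in_queue * buffer_factor_percent // 100
    min_threshold = max_threshold // 3
    return min_threshold, max_threshold


# Hypothetical buffer factors applied to a 20-packet queue.
print(red_thresholds(20, 30))  # (2, 6)
print(red_thresholds(20, 60))  # (4, 12)
```

Integer division is used because thresholds are expressed in whole packets; a device that accepts byte thresholds could scale the result by a packet size instead.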

Some links may not be able to admit the reserved bandwidth required for each of the forwarding classes. Therefore, the provisioning system that is used should detect these links and report back an admission error.
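
One possible form of such an admission check is sketched below. It assumes that a link fails admission when its capacity is smaller than the sum of the rates reserved for all forwarding classes. The function name, the link names, the kbit/s unit, and the treatment of the EF maximum rate as a reservation are all assumptions made for illustration; the rate values are those of Table 2.

```python
def find_admission_errors(link_capacity_kbps: dict[str, int],
                          reserved_kbps_by_class: dict[str, int]) -> list[str]:
    """Return the names of links that cannot admit the total reserved bandwidth."""
    required = sum(reserved_kbps_by_class.values())
    return sorted(name for name, capacity in link_capacity_kbps.items()
                  if capacity < required)


# Rates from Table 2, interpreted (as an assumption) in kbit/s.
reserved = {"BE": 64, "AF": 512, "EF": 512}
# Hypothetical links and capacities.
links = {"serial0": 1024, "ethernet0": 10000}

print(find_admission_errors(links, reserved))  # ['serial0']
```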

Some devices may not be able to implement the PHB set defined herein, either because not enough queues are available, or because the scheduler cannot provide preemptive scheduling. The provisioning system that is used should detect such deficient devices and report back an error. In order to provide the best approximate PHB configuration that can be implemented on a deficient device, additional information should be specified in the PHB set. For example, if only two queues are available on the device, the PHB set should specify whether AF and BE should be mapped to the same queue, or whether EF and AF should.

As a result, PHBs are mapped to specific device configuration parameter values, which may be configured in one or more devices using conventional means, e.g., by messages that are sent to the devices from a policy decision point.

General Rules for Mapping Network Wide PHB Settings to Device Configuration

According to an embodiment, functions are provided for determining the PHB configuration parameters that can be provided to a router or switch to implement the functions described herein. FIG. 3 is a flow diagram of an example method of determining PHB configuration parameters. The process of FIG. 3 may be used, for example, to translate a PHB set table to values in a MIB, values in a PIB, or to another abstract representation of a device configuration.

In block 302, PHBs are mapped to queues. In an embodiment, the mapping is not derived directly from the PHB table; rather, the following rules are observed.

1. All PHBs within the same forwarding class are mapped to the same queue.
2. The maximal number of queues required is equal to the number of forwarding classes in the PHB table.
3. PHBs with immediate forwarding flag set should be mapped to a queue served by a preemptive scheduler.
4. If not enough queues are available, or if not enough priority queues are available, the network administrator must be notified. If required, the network administrator should specify which PHBs should be mapped to the same queue.

Table 5 is an example of a mapping of PHBs to queues for the PHB definition provided in FIG. 2.

TABLE 5
EXAMPLE SERVICE TO QUEUE MAPPING

Queue Type    PHB            Queue Number
1q            Best Effort    1
              AF1x           1
              EF             1
1q1p          Best Effort    1
              AF1x           1
              EF             1p
2q            Best Effort    1
              AF1x           1
              EF             2
2q1p          Best Effort    1
              AF1x           2
              EF             1p
3q            Best Effort    1
              AF1x           2
              EF             3

Table 5 provides an example of mappings of PHBs to queues. The queue type column denotes the number of ordinary queues and the number of priority queues. The automatic rules map the PHB specification of FIG. 2 into a queue type 2q1p. The maximal number of queues required is 3. Mappings to queue types 1q1p and 3q best approximate the requirements of the PHB set of FIG. 2, but the network administrator must be notified that the full PHB set cannot be enforced. Mapping of the PHBs to the 2q queue type should be made by the network administrator, as there is no way to determine whether AF should share a queue with BE or with EF.
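
The mapping rules of block 302 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular device: each PHB is represented as a hypothetical tuple of name, forwarding class, and immediate forwarding flag, and the sketch handles only the case in which enough queues exist, raising an exception wherever the text calls for notifying the network administrator. All names in the code are hypothetical.

```python
class AdministratorDecisionRequired(Exception):
    """Raised when the device lacks the queues needed for an exact mapping."""


def map_phbs_to_queues(phbs, ordinary_queues: int, priority_queues: int) -> dict[str, str]:
    """Map each PHB name to a queue label such as "1", "2", or "1p".

    phbs is a list of (name, forwarding_class, immediate_forwarding) tuples.
    """
    # Rule 1: one queue per forwarding class, so collect classes in order.
    needs_priority = {}
    for _name, fclass, immediate in phbs:
        needs_priority[fclass] = needs_priority.get(fclass, False) or immediate

    # Rule 3: classes with the immediate forwarding flag need a priority queue.
    priority_classes = [c for c, flag in needs_priority.items() if flag]
    ordinary_classes = [c for c, flag in needs_priority.items() if not flag]

    # Rule 4: report the shortfall instead of guessing which classes share.
    if len(priority_classes) > priority_queues or len(ordinary_classes) > ordinary_queues:
        raise AdministratorDecisionRequired("not enough queues for an exact mapping")

    queue_of_class = {}
    for number, fclass in enumerate(ordinary_classes, start=1):
        queue_of_class[fclass] = str(number)
    for number, fclass in enumerate(priority_classes, start=1):
        queue_of_class[fclass] = f"{number}p"
    return {name: queue_of_class[fclass] for name, fclass, _ in phbs}


# Hypothetical encoding of a PHB set with three forwarding classes.
PHB_SET = [("BE", "BE", False), ("AF11", "AF1", False),
           ("AF12", "AF1", False), ("EF", "EF", True)]

print(map_phbs_to_queues(PHB_SET, ordinary_queues=2, priority_queues=1))
```

For a 2q1p device the sketch places BE in queue 1, both AF PHBs in queue 2, and EF in the priority queue, in line with the 2q1p rows of Table 5. For 1q1p, 2q, or 3q devices it raises the exception, leaving the approximate mapping to the administrator.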

In block 304, drain sizes and queue sizes are calculated and stored in a weights table. A queue's drain size is the maximum number of bytes that may be drained from the queue in one cycle. Therefore, drain sizes determine the relative bandwidth allocation for each queue. Some devices can be configured directly with the maximal or minimal rate each queue should service. Transformations between the representation of drain size in bytes and other representations are straightforward. A queue size is the maximal number of bytes the queue can hold. Some devices set queue size in terms of packets. Queue size in packets can be calculated by dividing the queue size in bytes by the largest packet size of all PHBs mapped to that queue.

In an embodiment, this table is created based on the PHB table, from the mapping of PHBs to queues and from values of an additional parameter, the total queue size. The total queue size, measured in packets, specifies the total queue length on each of the interfaces.

In one embodiment, drain sizes and queue sizes are created as follows:

1. The queue size of a given queue is proportional to the sum of buffer factors of all PHBs that were mapped to this queue in block 302. The sum is a percentage of the total queue size. Alternatively, the queue size is proportional to the sum of reserved packets of all PHBs that were mapped to this queue in block 302. Using the largest packet size of all PHBs mapped to this queue, one can specify the queue size in bytes.
2. The drain size of a given queue is proportional to the sum of forwarding factors of all services mapped to this queue in step 1. For each queue type, the queue with the smallest associated forwarding factor is assigned a drain size of one packet size. All other queues in this queue type get a proportional drain size, according to their forwarding factors.

An example weights table for the Service Template shown in FIG. 2 is set forth in Table 6. For purposes of Table 6 a Total Queue Size of 60 Packets is assumed.

TABLE 6
EXAMPLE WEIGHTS TABLE

Queue Type    Queue Number    Drain Size (bytes)    Queue Size (bytes)
1q            1               1500                  90000
2q            1               225                   81000
              2               200                   1200
1q1p          1               225                   81000
              1p              200                   1200
2q1p          1               1500                  27000
              2               12000                 54000
              1p              12000                 1200
3q            1               1500                  27000
              2               12000                 54000
              3               12000                 1200
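
The two rules of block 304 can be sketched as follows, as a minimal illustration under stated assumptions. The `Phb` record and its field names are hypothetical, as are the buffer factor, forwarding factor, and packet size values, which are not taken from FIG. 2. They were chosen so that the output coincides with the 2q1p rows of Table 6 for a total queue size of 60 packets. The sketch also assumes that "one packet size" means the largest packet size of the PHBs on the queue with the smallest forwarding factor.

```python
from dataclasses import dataclass


@dataclass
class Phb:
    name: str
    buffer_factor: float      # percent of the total queue size (assumed field)
    forwarding_factor: float  # relative bandwidth share (assumed field)
    packet_size: int          # largest packet size in bytes (assumed field)


def weights_table(queue_to_phbs: dict[str, list[Phb]],
                  total_queue_packets: int) -> dict[str, tuple[int, int]]:
    """Return {queue: (drain_size_bytes, queue_size_bytes)}."""
    forwarding = {q: sum(p.forwarding_factor for p in phbs)
                  for q, phbs in queue_to_phbs.items()}
    # The queue with the smallest forwarding factor drains one packet per cycle.
    smallest = min(forwarding, key=forwarding.get)
    base_packet = max(p.packet_size for p in queue_to_phbs[smallest])

    table = {}
    for queue, phbs in queue_to_phbs.items():
        largest_packet = max(p.packet_size for p in phbs)
        # Rule 1: queue size follows the summed buffer factors.
        size_packets = total_queue_packets * sum(p.buffer_factor for p in phbs) / 100
        queue_bytes = round(size_packets * largest_packet)
        # Rule 2: drain size is proportional to the summed forwarding factors.
        drain_bytes = round(base_packet * forwarding[queue] / forwarding[smallest])
        table[queue] = (drain_bytes, queue_bytes)
    return table


# Hypothetical PHB parameters for a 2q1p device.
queues = {
    "1": [Phb("BE", buffer_factor=30, forwarding_factor=1, packet_size=1500)],
    "2": [Phb("AF", buffer_factor=60, forwarding_factor=8, packet_size=1500)],
    "1p": [Phb("EF", buffer_factor=10, forwarding_factor=8, packet_size=200)],
}
print(weights_table(queues, total_queue_packets=60))
```

Dividing a queue size in bytes by the largest packet size on that queue recovers the size in packets for devices that are configured in packets.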

In block 306 the threshold positions within each queue are determined. If enough queues are available within the device, each forwarding class is mapped to a different queue. Each of the PHBs within a PHB group (i.e., PHBs that have the same forwarding class) should be mapped to a different threshold. The threshold position within this queue should be proportional to the buffer factors of each of the PHBs, or alternatively should equal the reserved packets of each of the PHBs. For example, when the PHB set of FIG. 2 is mapped to a 2q1p queue type, both AF11 and AF12 are mapped to queue 2. Queue 2 should have a single threshold positioned at 50% of the queue length. This threshold would then enforce the buffer factor relation between AF11 and AF12. If not enough thresholds are available to differentiate between the different PHBs within a forwarding class, or if the threshold positions cannot be set appropriately, the network administrator should be notified, and the closest approximation should be taken.

When not enough queues are available, more than one forwarding class is mapped to a single queue. In this case, different PHBs can be mapped to different thresholds according to their different buffer factor or reserved packet fields. For example, Table 5 indicates that both AF1x and BE are mapped to the same queue in 2q queue types. Both AF12 and BE of FIG. 2 are configured with the same buffer factor and therefore can use the same threshold. Since AF11 can utilize the whole queue, the threshold position should then be set to 50% in this example.
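
The threshold placement of block 306 can be sketched as follows. The sketch assumes that the PHB with the largest buffer factor in a queue may use the whole queue and that every other PHB gets a threshold proportional to its buffer factor. The function name and the buffer factor values of 30 and 60 are hypothetical; they are consistent with the 50% threshold described above.

```python
def threshold_positions(buffer_factors: dict[str, float]) -> dict[str, float]:
    """Return each PHB's threshold as a percentage of the queue length.

    Assumption: the PHB with the largest buffer factor may fill the whole
    queue, and the others are scaled in proportion to their buffer factors.
    """
    largest = max(buffer_factors.values())
    return {name: 100.0 * factor / largest
            for name, factor in buffer_factors.items()}


# 2q1p case: AF11 and AF12 share queue 2.
print(threshold_positions({"AF11": 60, "AF12": 30}))
# 2q case: BE shares the queue with both AF PHBs.
print(threshold_positions({"BE": 30, "AF11": 60, "AF12": 30}))
```

PHBs that end up with the same percentage, such as BE and AF12 in the second call, can share a single threshold on the device.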

In block 308 the parameters for fragmentation and interleave mechanisms are determined. Fragmentation and interleave mechanisms are required only if a PHB with an immediate forwarding flag is set, and only if the link MTU is larger than the fragment size (fs) determined according to the relation:

fs = max-phop-delay / link-speed

If fs is smaller than the link MTU, then all packets that belong to non-immediate forwarding PHBs should be fragmented to a size not bigger than fs.
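
The decision made in block 308 can be sketched as follows. The text gives the relation for fs without units, so the sketch rests on an assumption: the per-hop delay budget is taken in milliseconds and the link speed in kbit/s, and fs is computed as the number of bytes the link can serialize within that budget. The function names, parameter names, and example values are hypothetical.

```python
def fragment_size_bytes(max_phop_delay_ms: int, link_speed_kbps: int) -> int:
    """Largest fragment, in bytes, that the link serializes within the delay budget.

    Assumption: 1 ms at 1 kbit/s carries exactly 1 bit, so the product of the
    two arguments is a bit count.
    """
    return (max_phop_delay_ms * link_speed_kbps) // 8


def needs_fragmentation(has_immediate_phb: bool, link_mtu_bytes: int, fs_bytes: int) -> bool:
    """Fragment non-immediate traffic only when fs is smaller than the link MTU."""
    return has_immediate_phb and fs_bytes < link_mtu_bytes


slow_link = fragment_size_bytes(max_phop_delay_ms=10, link_speed_kbps=64)
fast_link = fragment_size_bytes(max_phop_delay_ms=10, link_speed_kbps=100_000)

print(slow_link, needs_fragmentation(True, 1500, slow_link))  # 80 True
print(fast_link, needs_fragmentation(True, 1500, fast_link))  # 125000 False
```

Under these assumptions a slow serial link needs fragmentation to protect immediate forwarding traffic, while a fast link with the same MTU does not.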

Hardware Overview

FIG. 4 is a block diagram that illustrates a computer system 500 upon which an embodiment of the invention may be implemented. The preferred embodiment is implemented using one or more computer programs running on a network element such as a router device. Thus, in this embodiment, the computer system 500 is a router.

Computer system 500 includes a bus 502 or other communication mechanism for communicating information, and a processor 504 coupled with bus 502 for processing information. Computer system 500 also includes a main memory 506, such as a random access memory (RAM), flash memory, or other dynamic storage device, coupled to bus 502 for storing information and instructions to be executed by processor 504. Main memory 506 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 504. Computer system 500 further includes a read only memory (ROM) 508 or other static storage device coupled to bus 502 for storing static information and instructions for processor 504. A storage device 510, such as a magnetic disk, flash memory or optical disk, is provided and coupled to bus 502 for storing information and instructions.

A communication interface 518 may be coupled to bus 502 for communicating information and command selections to processor 504. Interface 518 is a conventional serial interface such as an RS-232 or RS-422 interface. An external terminal 512 or other computer system connects to the computer system 500 and provides commands to it using the interface 514. Firmware or software running in the computer system 500 provides a terminal interface or character-based command interface so that external commands can be given to the computer system.

A switching system 516 is coupled to bus 502 and has an input interface 514 and an output interface 519 to one or more external network elements. The external network elements may include a local network 522 coupled to one or more hosts 524, or a global network such as Internet 528 having one or more servers 530. The switching system 516 switches information traffic arriving on input interface 514 to output interface 519 according to pre-determined protocols and conventions that are well known. For example, switching system 516, in cooperation with processor 504, can determine a destination of a packet of data arriving on input interface 514 and send it to the correct destination using output interface 519. The destinations may include host 524, server 530, other end stations, or other routing and switching devices in local network 522 or Internet 528.

The invention is related to the use of computer system 500 for the techniques and functions described herein in a network system. According to one embodiment of the invention, such techniques and functions are provided by computer system 500 in response to processor 504 executing one or more sequences of one or more instructions contained in main memory 506. Such instructions may be read into main memory 506 from another computer-readable medium, such as storage device 510. Execution of the sequences of instructions contained in main memory 506 causes processor 504 to perform the process steps described herein. One or more processors in a multi-processing arrangement may also be employed to execute the sequences of instructions contained in main memory 506. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.

The term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor 504 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 510. Volatile media includes dynamic memory, such as main memory 506. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 502. Transmission media can also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.

Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.

Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 504 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 500 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector coupled to bus 502 can receive the data carried in the infrared signal and place the data on bus 502. Bus 502 carries the data to main memory 506, from which processor 504 retrieves and executes the instructions. The instructions received by main memory 506 may optionally be stored on storage device 510 either before or after execution by processor 504.

Communication interface 518 also provides a two-way data communication coupling to a network link 520 that is connected to a local network 522. For example, communication interface 518 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 518 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 518 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

Network link 520 typically provides data communication through one or more networks to other data devices. For example, network link 520 may provide a connection through local network 522 to a host computer 524 or to data equipment operated by an Internet Service Provider (ISP) 526. ISP 526 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 528. Local network 522 and Internet 528 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 520 and through communication interface 518, which carry the digital data to and from computer system 500, are exemplary forms of carrier waves transporting the information.

Computer system 500 can send messages and receive data, including program code, through the network(s), network link 520 and communication interface 518. In the Internet example, a server 530 might transmit a requested code for an application program through Internet 528, ISP 526, local network 522 and communication interface 518. In accordance with the invention, one such downloaded application provides for the techniques and functions that are described herein.

The received code may be executed by processor 504 as it is received, and/or stored in storage device 510, or other non-volatile storage for later execution. In this manner, computer system 500 may obtain application code in the form of a carrier wave.

Alternatives, Extensions

The system as described herein provides for a network-wide PHB definition for maintaining consistent per-hop-behaviors within a network. By providing a schema that allows a user to define a relationship between the different PHBs that are associated with a DS domain, a mechanism is provided whereby a network administrator can tune the bandwidth and buffer resources allocated to each PHB within the DS domain. Moreover, the tuning of the bandwidth and buffer resources allocated to each PHB may be performed without regard to the particular configurations of each device.

In describing certain embodiments of the invention, several drawing figures have been used for explanation purposes. However, the invention is not limited to any particular context as shown in drawing figures, and the spirit and scope of the invention include other contexts and applications in which the mechanisms and processes described herein are available to other mechanisms, methods, programs, and processes. Thus, the specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

For example, although not depicted in FIG. 1C, in certain embodiments, policy servers 106, 110 may be coupled to and thus communicate with edge devices 120, 122 via network 128.

In addition, in this disclosure, including in the claims, certain process steps are set forth in a particular order, and alphabetic and alphanumeric labels are used to identify certain steps. Unless specifically stated in the disclosure, embodiments of the invention are not limited to any particular order of carrying out such steps. In particular, the labels are used merely for convenient identification of steps, and are not intended to imply, specify or require a particular order of carrying out such steps.

1. A method for maintaining consistent per-hop packet forwarding behavior among a plurality of network devices in a network within a Differentiated Services (DS) domain, comprising the computer-implemented steps of: creating and storing a network-wide per-hop forwarding behavior (PHB) definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain; determining a mapping of one or more of the PHBs in the network-wide PHB definition to one or more queues of the network devices; determining a drain size and a queue size for each of the queues to which PHBs are mapped; determining a mapping of each of the PHBs to a threshold value associated with the queues; and determining parameters of fragmentation and interleave mechanism for the device interfaces.
2. A method as recited in claim 1, wherein the network-wide service definition further associates a service with a buffer value that identifies a percentage of buffer space that the associated service is assured to receive in any of the network devices.
3. A method as recited in claim 1, wherein the network-wide service definition further associates a service with an excess action value that identifies a default treatment that the network devices enforce on traffic of the associated service that exceeds an associated rate value.
4. A method as recited in claim 1, wherein creating and storing a network-wide service definition comprises the steps of: creating and storing a network-wide service definition that associates a first service and a second service in a forwarding class each with a DS code point (DSCP) value, with a minimum bandwidth value, and with an excess action value that identifies a default treatment that the network devices enforce on traffic of the associated service that exceeds an associated rate value, wherein the excess action value of the second service specifies that the network devices shall re-mark packets of the first service to indicate the second service.
5. A method as recited in claim 4, wherein a single minimum bandwidth value is specified in the network-wide service definition for both the first service and the second service.
6. A method as recited in claim 5, wherein the buffer value for the first service is less than the buffer value for the second service.
7. A method as recited in claim 1, wherein the step of determining a mapping of one or more of the services in the network-wide service definition to one or more queues of the network devices comprises the steps of mapping all services within the same forwarding class to the same queue, such that a maximum number of queues is equal to a number of the forwarding classes.
8. A method as recited in claim 1, wherein the step of determining a drain size and a queue size for each of the queues to which services are mapped comprises the steps of: receiving a packet size value representing a typical packet size in bytes; receiving a total queue size value representing a total queue length in packets on all interfaces; determining the queue size value in proportion to a sum of buffer values of all services that are mapped to the associated queue as compared to the total queue size value; determining the drain size value, wherein a queue having a smallest associated transmit value is assigned a drain size value equal to the packet size and all other queues are assigned a drain size value in proportion to a sum of all transmit values of all services that are mapped to the associated queue.
9. A method as recited in claim 1, wherein mapping services to thresholds comprises the steps of mapping each service among a plurality of services within a forwarding class to different thresholds; and mapping services that require that packets are not discarded even if delayed to a higher threshold when there are not enough queues to separate among flows of services.
10. A method as recited in claim 1, wherein determining threshold position values comprises the steps of: for any forwarding class having more than one service, determining the threshold positions based on the buffer values of one service in the forwarding class that has an excess action value indicating remarking.
11. A method as recited in claim 1, wherein determining threshold position values comprises the steps of determining a first threshold position value for a first service of a forwarding class and determining a second threshold position value for a second service of the forwarding class based on a ratio of the buffer values of the first service and the second service.
12. A computer-readable medium comprising one or more sequences of instructions for maintaining consistent per-hop packet forwarding behavior PHB among a plurality of network devices in a network within a Differentiated Services (DS) domain, which instructions, when executed by one or more processors, cause the one or more processors to carry out the steps of: creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain; determining a mapping of one or more of the PHBs in the network-wide service definition to one or more queues of the network devices; determining a drain size and a queue size for each of the queues to which PHBs are mapped; determining a mapping of each of the PHBs to a threshold value associated with the queues; and determining parameters for fragmentation and interleave mechanisms for the device interfaces.
13. The computer-readable medium as recited in claim 12, wherein the network-wide service definition further associates a service with a buffer value that identifies a percentage of buffer space that the associated service is assured to receive in any of the network devices.
14. An apparatus for maintaining consistent per-hop packet forwarding behavior PHB among a plurality of network devices in a network within a Differentiated Services (DS) domain, comprising: means for creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain; means for determining a mapping of one or more of the PHBs in the network-wide PHB definition to one or more queues of the network devices; means for determining a drain size and a queue size for each of the queues to which PHBs are mapped; means for determining a mapping of each of the PHBs to a threshold value associated with the queues; and means for determining parameters of fragmentation and interleave mechanisms for the device interfaces.
15. The apparatus recited in claim 14, wherein the network-wide service definition further associates a service with a buffer value that identifies a percentage of buffer space that the associated service is assured to receive in any of the network devices.
16. An apparatus for maintaining consistent per-hop packet forwarding behavior PHB among a plurality of network devices in a network within a Differentiated Services (DS) domain, comprising: a processor; a network interface coupled to the processor for communicating data packets among the network devices and the processor; a memory having one or more stored sequences of instructions which, when executed by the processor, cause the processor to carry out the steps of: creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain; determining a mapping of one or more of the PHBs in the network-wide PHB definition to one or more queues of the network devices; determining a drain size and a queue size for each of the queues to which PHBs are mapped; determining a mapping of each of the PHBs to a threshold value associated with the queues; and determining parameters of fragmentation and interleave mechanism for the device interfaces.
17. The apparatus recited in claim 16, wherein the network-wide service definition further associates a service with a buffer value that identifies a percentage of buffer space that the associated service is assured to receive in any of the network devices.
18. A method for maintaining consistent per-hop packet forwarding behavior PHB among a plurality of network devices in a network within a Differentiated Services (DS) domain, comprising the computer-implemented steps of: creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain; mapping each of the PHBs in the network-wide PHB definition to one or more queues of the network devices; creating and storing one or more network device configuration parameter values based on the mapping; sending the configuration parameter values to each of the network devices within the DS domain; wherein the network-wide service definition further associates a service with an excess action value that identifies a default treatment that the network devices enforce on traffic of the associated service that exceeds an associated rate value.
19. A method as recited in claim 18, wherein the mapping step comprises mapping all PHBs within the same forwarding class to the same queue, such that a maximum number of queues is equal to a number of the forwarding classes.